HITS' Monolingual and Cross-lingual Entity Linking System at TAC 2012: A Joint Approach
Abstract
This paper presents HITS’ system for monolingual and cross-lingual entity linking at TAC 2012. We propose a joint system for entity disambiguation, recognition of NILs and clustering using Markov Logic. The proposed model (1) is global, i.e. a group of mentions in a text is disambiguated in a single step that combines various global and local features, and (2) performs disambiguation, unknown entity detection and clustering jointly. The model for all languages is trained exclusively on English Wikipedia articles. The results achieved in the TAC monolingual and cross-lingual entity linking tasks show that our approach is competitive: our best English run scores 8.5 percentage points above the median, and we outperformed all other participating systems in the Chinese cross-lingual subtask. The results for the Spanish subtask are lower due to a bug. Our unofficial Spanish results (after fixing the bug) are close to those of the best system.
Similar resources
HITS' Monolingual and Cross-lingual Entity Linking System at TAC 2013
This paper presents HITS’ system for monolingual and cross-lingual entity linking at TAC 2013. The system is an extended version of our last year’s joint entity disambiguation and clustering system based on Markov Logic Networks. We describe the new extensions and discuss the results. The results show that our approach is competitive across all three languages: with a micro-average accuracy of ...
Analysis and Refinement of Cross-Lingual Entity Linking
In this paper we propose two novel approaches to enhance cross-lingual entity linking (CLEL). One is based on cross-lingual information networks, aligned based on monolingual information extraction, and the other uses topic modeling to ensure global consistency. We enhance a strong baseline system derived from a combination of state-of-the-art machine translation and monolingual entity linking ...
HITS' Cross-lingual Entity Linking System at TAC 2011: One Model for All Languages
This paper presents HITS’ system for cross-lingual entity linking at TAC 2011. We approach the task in three stages: (1) context disambiguation to obtain a language-independent representation, (2) entity disambiguation, (3) clustering of the queries that have not been linked in the second step. For each of these steps one single model is trained and applied to both languages, i.e. English and Ch...
CUNY-UIUC-SRI TAC-KBP2011 Entity Linking System Description
In this paper we describe a joint effort by the City University of New York (CUNY), University of Illinois at Urbana-Champaign (UIUC) and SRI International at participating in the mono-lingual entity linking (MLEL) and cross-lingual entity linking (CLEL) tasks for the NIST Text Analysis Conference (TAC) Knowledge Base Population (KBP2011) track. The MLEL system is based on a simple combination ...
Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual
To advance information extraction and question answering technologies toward a more realistic path, the U.S. NIST (National Institute of Standards and Technology) initiated the KBP (Knowledge Base Population) task as one of the TAC (Text Analysis Conference) evaluation tracks. It aims to encourage research in automatic information extraction of named entities from unstructured texts with the ul...